Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech: Speech Understanding / Interaction

نویسندگان

  • David Chow
  • Waleed H. Abdulla
چکیده

Log area ratio coefficients (LAR) derived from linear prediction coefficients (LPC) is a well known feature extraction technique used in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system. Here, instead of using the mel frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker identification system. An F-ratio feature analysis was conducted on both the LAR and MFCC feature vectors which showed the lower order LAR coefficients are superior to MFCC counterpart. The textindependent, closed-set speaker identification rate, as tested on the downsampled version of TIMIT database, was improved from 96.73%, using the MFCC feature, to 98.81%, using the LAR features.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

متن کامل

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

متن کامل

Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty

In this paper an estimator for speech enhancement based on Laplacian Mixture Model has been proposed. The proposed method, estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE  estimator, when the clean speech DFT coefficients are supposed mixture of Laplacians and the DFT coefficients of  noise are assumed zero-mean Gaussian distribution. Furthermore, the MMS...

متن کامل

Robust text-independent speaker identification using Gaussian mixture speaker models

This paper introduces and motivates the use of Gaussian mixture models (CMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are efTective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance ...

متن کامل

Text-independent speaker identification using Gaussian mixture bigram models

In this paper, a novel speaker modeling technique based on Gaussian mixture bigram model (GMBM) is introduced and evaluated for text-independent speaker identification (speaker-ID). GMBM is a stochastic framework that explores the context or time dependency of continuous observations from an information source. In view of the fact that speech features are correlated between successive frames, w...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004